ImageNet-1K dataset
Appendix: KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training

Appendix A. Proof of Lemma 1
Table 1 summarizes the models and datasets used in this work. ImageNet-1K (Deng et al., 2009): we use a subset of the ImageNet dataset. DeepCAM (Kurth et al., 2018): a dataset for image segmentation. Fractal-3K (Kataoka et al., 2022): a dataset rendered with the Visual Atom method; we also follow the training setting of Kataoka et al. (2022). Table 2 shows the details of our hyper-parameters. Specifically, we follow the TorchVision guideline to train ResNet-50, which uses a cosine learning-rate schedule. To show the robustness of KAKURENBO, we also train ResNet-50 with different settings; for the ResNet-50 (A) setting, we follow the hyper-parameters reported in Goyal et al. (2017). It is worth noting that KAKURENBO merely hides samples before the input pipeline. In this section, we present an analysis of the factors affecting KAKURENBO's performance. The results show that our method dynamically hides samples at each epoch.
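The per-epoch sample hiding described above can be sketched as follows. This is a simplified illustration, not KAKURENBO's actual selection rule (which also considers prediction history); the lowest-loss criterion, the function name, and the `hide_fraction` parameter are assumptions for the sketch.

```python
def select_visible_indices(losses, hide_fraction):
    """Hide a fraction of samples for the coming epoch, before the
    input pipeline ever sees them; return the indices to train on.

    `losses` holds per-sample losses from the previous epoch;
    `hide_fraction` is the fraction of the dataset to hide
    (both are illustrative names, not KAKURENBO's real interface).
    """
    n_hide = int(len(losses) * hide_fraction)
    # Rank sample indices by loss, ascending: easiest samples first.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    hidden = set(order[:n_hide])
    # Everything not hidden stays visible, in original dataset order.
    return [i for i in range(len(losses)) if i not in hidden]

losses = [0.1, 2.3, 0.05, 1.7, 0.9]
visible = select_visible_indices(losses, hide_fraction=0.4)  # → [1, 3, 4]
```

Because the selection happens before the input pipeline, the training loop itself is unchanged; only the sampler's index list shrinks each epoch.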
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting (Supplemental Material)
Ziyi Wang, Xumin Yu, Yongming Rao, Jie Zhou, Jiwen Lu
During the geometry-preserved projection, several points may fall in the same pixel. We therefore ablate the pooling strategy in Table 3, comparing max-pooling, mean-pooling, and summation. The classification ablation results show that summation outperforms both max-pooling and mean-pooling. After migrating the pre-trained image models to point cloud analysis with Point-to-Pixel Prompting, we report the number of trainable parameters (Tr. We choose 4 segments of ϕ.
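The three pooling strategies compared in the ablation can be sketched as below. The function name and the flat scalar features are illustrative assumptions; in the real pipeline the aggregation runs over per-point feature vectors on a 2D pixel grid.

```python
def pool_points_to_pixels(points, strategy="sum"):
    """Aggregate features of points that project onto the same pixel.

    `points` is a list of (pixel_index, feature) pairs; the pixel index
    is assumed to be precomputed by the geometry-preserved projection.
    """
    buckets = {}
    for pix, feat in points:
        buckets.setdefault(pix, []).append(feat)
    pooled = {}
    for pix, feats in buckets.items():
        if strategy == "max":
            pooled[pix] = max(feats)
        elif strategy == "mean":
            pooled[pix] = sum(feats) / len(feats)
        else:  # "sum": the variant that performs best in the ablation
            pooled[pix] = sum(feats)
    return pooled

pts = [(0, 1.0), (0, 3.0), (1, 2.0)]
pool_points_to_pixels(pts, "sum")   # → {0: 4.0, 1: 2.0}
pool_points_to_pixels(pts, "mean")  # → {0: 2.0, 1: 2.0}
```

Summation differs from the other two in that it preserves point density: a pixel hit by many points gets a proportionally larger response, which mean- and max-pooling discard.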
- South America > Peru > Loreto Department (0.05)
- Asia > China (0.05)
A. ImageNet Texture
See Figures 7 and 8 for examples of the ImageNet-Texture dataset and their counterparts in the original ImageNet dataset. Shape is often less well-defined in these classes, for example in window screen and rapeseed.

B.1 Comparison of two ways to apply α in NCE loss

Since the denominator normalizes the three kinds of pairs equally, we only pay attention to the numerator. Because of the exponential tail, it applies an exponentially larger weight to the negatives that are harder. Our patch-based augmentation is also closely related to self-supervised learning methods that solve jigsaw puzzles as the pretext task. All of our models are trained on 4 GTX 1080 Ti GPUs.
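The two generic ways of applying a weight α to a similarity term in an NCE-style loss can be contrasted as follows. This is a hedged sketch, not the paper's exact formulation: it only illustrates why placing α inside the exponent gives exponentially larger weight to harder (higher-similarity) negatives, while placing it outside rescales all negatives uniformly.

```python
import math

def weight_inside_exponent(sim, alpha):
    # alpha scales the similarity inside the exponent: exp(alpha * sim).
    # The weight ratio between two negatives grows exponentially with
    # their similarity gap, so hard negatives dominate.
    return math.exp(alpha * sim)

def weight_outside_exponent(sim, alpha):
    # alpha multiplies the exponentiated similarity: alpha * exp(sim).
    # Every negative is rescaled by the same constant, so the relative
    # weighting between easy and hard negatives is unchanged.
    return alpha * math.exp(sim)

# Ratio of a hard negative (sim=2) to an easy one (sim=1), alpha=2:
inside = weight_inside_exponent(2.0, 2.0) / weight_inside_exponent(1.0, 2.0)
outside = weight_outside_exponent(2.0, 2.0) / weight_outside_exponent(1.0, 2.0)
# inside ≈ e^2 ≈ 7.39, outside ≈ e ≈ 2.72: the inside placement
# amplifies hard negatives much more strongly.
```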
FocusDD: Real-World Scene Infusion for Robust Dataset Distillation
Hu, Youbing, Cheng, Yun, Saukh, Olga, Ozdemir, Firat, Lu, Anqi, Cao, Zhiqiang, Li, Zhijun
Dataset distillation has emerged as a strategy to compress real-world datasets for efficient training. However, it struggles with large-scale and high-resolution datasets, limiting its practicality. This paper introduces a novel resolution-independent dataset distillation method, Focused Dataset Distillation (FocusDD), which achieves diversity and realism in distilled data by identifying key information patches, thereby ensuring the generalization capability of the distilled dataset across different network architectures. Specifically, FocusDD leverages a pre-trained Vision Transformer (ViT) to extract key image patches, which are then synthesized into a single distilled image. These distilled images, which capture multiple targets, are suitable not only for classification tasks but also for dense tasks such as object detection. To further improve the generalization of the distilled dataset, each synthesized image is augmented with a downsampled view of the original image. Experimental results on the ImageNet-1K dataset demonstrate that, with 100 images per class (IPC), ResNet50 and MobileNet-v2 achieve validation accuracies of 71.0% and 62.6%, respectively, outperforming state-of-the-art methods by 2.8% and 4.7%. Notably, FocusDD is the first method to use distilled datasets for object detection tasks. On the COCO2017 dataset, with an IPC of 50, YOLOv11n and YOLOv11s achieve 24.4% and 32.1% mAP, respectively, further validating the effectiveness of our approach.
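The synthesis step described in the abstract — keep the patches a pre-trained ViT scores highest, compose them into one distilled image, and append a downsampled view of the original — can be sketched schematically. All names and shapes here are illustrative assumptions: patches are flat number lists standing in for pixel blocks, and the scores stand in for ViT attention.

```python
def compose_distilled_image(patches, scores, n_keep, thumbnail):
    """Assemble a distilled 'image' from the highest-scoring patches
    plus a downsampled view of the original image.

    `patches`: candidate patches (flat lists standing in for pixels).
    `scores`: one importance score per patch (stand-in for ViT attention).
    `thumbnail`: the downsampled view appended for generalization.
    """
    # Rank patch indices by score, descending, and keep the top n_keep.
    ranked = sorted(range(len(patches)), key=lambda i: -scores[i])
    kept = [patches[i] for i in ranked[:n_keep]]
    # Concatenate kept patches, then append the downsampled view.
    return [v for p in kept for v in p] + thumbnail

patches = [[1, 1], [2, 2], [3, 3]]
scores = [0.2, 0.9, 0.5]
compose_distilled_image(patches, scores, n_keep=2, thumbnail=[9])
# → [2, 2, 3, 3, 9]
```

The point of the sketch is the data flow, not the arithmetic: the distilled image is a mosaic of the most informative regions, with the low-resolution global view retained so no whole-image context is lost.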
MILAN: Masked Image Pretraining on Language Assisted Representation
Hou, Zejiang, Sun, Fei, Chen, Yen-Kuang, Xie, Yuan, Kung, Sun-Yuan
Self-attention based transformer models have been dominating many computer vision tasks in the past few years. Their superb model qualities heavily depend on excessively large labeled image datasets. In order to reduce the reliance on large labeled datasets, reconstruction-based masked autoencoders are gaining popularity, which learn high-quality transferable representations from unlabeled images. For the same purpose, recent weakly supervised image pretraining methods explore language supervision from text captions accompanying the images. In this work, we propose masked image pretraining on language assisted representation, dubbed MILAN. Instead of predicting raw pixels or low-level features, our pretraining objective is to reconstruct the image features with substantial semantic signals that are obtained using caption supervision. Moreover, to accommodate our reconstruction target, we propose a more effective prompting decoder architecture and a semantic-aware mask sampling mechanism, which further advance the transfer performance of the pretrained model. Experimental results demonstrate that MILAN delivers higher accuracy than previous works. When the masked autoencoder is pretrained and finetuned on the ImageNet-1K dataset with an input resolution of 224x224, MILAN achieves a top-1 accuracy of 85.4% on ViT-Base, surpassing the previous state of the art by 1%. In the downstream semantic segmentation task, MILAN achieves 52.7 mIoU using ViT-Base on the ADE20K dataset, outperforming previous masked pretraining results by 4 points.
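The core objective change in MILAN — regress language-assisted features of the masked patches rather than their raw pixels — can be sketched as a masked feature-reconstruction loss. This is a simplified stand-in: in the real method the targets come from a caption-supervised image encoder and the features are vectors, whereas here scalars and the function name are assumptions for illustration.

```python
def feature_reconstruction_loss(pred_feats, target_feats, mask):
    """Mean squared error between predicted features and the
    language-assisted target features, computed only over masked
    patch positions (unmasked patches contribute nothing).

    `mask[i]` is truthy where patch i was masked during pretraining.
    """
    total, count = 0.0, 0
    for pred, target, masked in zip(pred_feats, target_feats, mask):
        if masked:
            total += (pred - target) ** 2
            count += 1
    # Guard against an all-visible batch; normally count > 0.
    return total / max(count, 1)

# Only positions 1 and 2 are masked, so position 0 is ignored.
feature_reconstruction_loss([1.0, 2.0, 3.0], [1.0, 0.0, 5.0], [0, 1, 1])
# → 4.0
```

Swapping pixel targets for semantic feature targets is what lets the masked autoencoder inherit caption-level semantics without needing captions at finetuning time.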